Date created: June 10, 2022
Author: P. Alexander Burnham

Summary Last time we set up several cubes from geotif files and explored some of the basic querrying and cube building functionality of cubetime. However, all of the files we were working with, had the same scales and grid sizes. Spacetime has the functionality to rescale rasters of different extents, grid sizes, epsg codes etc. and create a cleaned cube where all of those factors are consistent accross the data layers.

Loading file names and creating a file object

Let’s load two rasters of data from India. We can then load the data up as a file object and compare some attributes between them.

# read files and load data
data = ["demoData/LULC_1995.tif", "demoData/India_cropped-area_1km_2016.tif"]
ds = read_data(data)
# check number of bands
ds.number_of_bands()

# check the epsg codes
## [1, 1]
ds.extract_epsg_code()

# check the raster dimensions
## ['EPSG:32644', 'EPSG:4326']
ds.get_raster_dimensions()

# check pixel size
## [(29484, 33555), (2373, 3269)]
ds.pixel_size()

# check no data values
## [100.0, 0.008868160000000002]
ds.get_nodata_value()
## [127.0, -1.7e+308]

Let’s plot the two files and compare them.


# Get the array data
arrays = ds.get_raster_data()

# plot file 1
plt.imshow(arrays[0], vmin=0, vmax=17)
## <matplotlib.image.AxesImage object at 0x7fce27b39850>
plt.show()

# plot file 2

plt.imshow(arrays[1], vmin=0, vmax=100)
## <matplotlib.image.AxesImage object at 0x7fce279b8bb0>
plt.show()

Let’s rescale these files using some spacetime functions.

Apart from the fact that both datasets are from approximately the same region, we can see that almost all of the attributes between these two files are different. Using raster_scale() we can get these two files to have the same spatial reference system,, pixel size and nodata value with cells aligned to the same lat and lon values.

# lets align these files
alignedDS = raster_align(data=ds, resolution=.08, SRS=4326, noneVal=127)

Let’s look at these attributes now that we have run the align function.


# check epsg codes
alignedDS.extract_epsg_code()
# get raster dimensions
## ['EPSG:4326', 'EPSG:4326']
alignedDS.get_raster_dimensions()
# pixel size
## [(409, 383), (264, 363)]
alignedDS.pixel_size()
# check no data values
## [0.08, 0.08]
alignedDS.get_nodata_value()
## [127.0, 127.0]

Lets also look at the rescaled images.

# get rescaled data
scaledArray = alignedDS.get_raster_data()

# plot file 1
plt.imshow(scaledArray[0], vmin=0, vmax=17)
## <matplotlib.image.AxesImage object at 0x7fce27a067f0>
plt.show()

# plot file 2

plt.imshow(scaledArray[1], vmin=0, vmax=100)
## <matplotlib.image.AxesImage object at 0x7fce27a0a3a0>
plt.show()

Upscaling and downscaling rasters automatically

Let’s upsale the lower resolution raster using the default nearest r algorithm

# finest grain resolution
alignedDS_max = raster_align(data=ds, resolution="max")
alignedDS_max.pixel_size()

# plot the file
## [100.0, 100.0]
scaledArray_max = alignedDS_max.get_raster_data()
plt.imshow(scaledArray_max[1], vmin=0, vmax=100)
## <matplotlib.image.AxesImage object at 0x7fcc5626c760>
plt.show()

Next we will downscale the higher resolution image using the “min” option using the same algorithm

# coarse resolution
alignedDS_min = raster_align(data=ds, resolution="min")
alignedDS_min.pixel_size()

# plot the file
## [957.302191144903, 957.302191144903]
scaledArray_min = alignedDS_min.get_raster_data()
plt.imshow(scaledArray_min[0], vmin=0, vmax=17)
## <matplotlib.image.AxesImage object at 0x7fce27bb15b0>
plt.show()

Cropping file objects

Intersection We can see now that the only difference is the extent of the files. The first file covers more surface area of the globe. Lets crop the files to have the same greatest common bounding box, (intersection).

# trim the file object
trimmedDS_intersection = raster_trim(alignedDS, method = "intersection")
# lets compare the dimensions of the new objects
trimmedDS_intersection.get_raster_dimensions()
## [(264, 363), (264, 363)]

Comparing the plotted images reveals that the files are now propperly aligned and trimmed

# get data
trimmedArray = trimmedDS_intersection.get_raster_data()

# plot file 1
plt.imshow(trimmedArray[0], vmin=0, vmax=17)
## <matplotlib.image.AxesImage object at 0x7fce27d64d60>
plt.show()

# plot file 2

plt.imshow(trimmedArray[1], vmin=0, vmax=100)
## <matplotlib.image.AxesImage object at 0x7fce279d7430>
plt.show()

Union We can also chose to crop the files so that no data are lost. However, no data values will populate the areas that do not exist on certain rasters.

# trim the file object
trimmedDS_union = raster_trim(alignedDS, method = "union")
# lets compare the dimensions of the new objects
trimmedDS_union.get_raster_dimensions()
## [(409, 383), (409, 383)]

Let’s look at these new files which should have the larger extent

# get data
trimmedArray = trimmedDS_union.get_raster_data()

# plot file 1
plt.imshow(trimmedArray[0], vmin=0, vmax=17)
## <matplotlib.image.AxesImage object at 0x7fce27a055e0>
plt.show()

# plot file 2

plt.imshow(trimmedArray[1], vmin=0, vmax=100)
## <matplotlib.image.AxesImage object at 0x7fce27b2a5b0>
plt.show()

Specified Points We can also crop our files using user defined upper left and lower right coordinates. These are the mimum required values for specifying a bounding box.

# trim the file object
trimmedDS_box = raster_trim(alignedDS, method="corners", ul = [73, 25], lr = [81, 13])

# get dimensions
trimmedDS_box.get_raster_dimensions()
## [(100, 150), (100, 150)]

Let’s look at these files, next.

# get data
trimmedArray = trimmedDS_box.get_raster_data()

# plot file 1
plt.imshow(trimmedArray[0], vmin=0, vmax=17)
## <matplotlib.image.AxesImage object at 0x7fcca78403d0>
plt.show()

# plot file 2

plt.imshow(trimmedArray[1], vmin=0, vmax=100)
## <matplotlib.image.AxesImage object at 0x7fcca784fca0>
plt.show()

Shape Files We can also use a shape file to crop these images. I am using the higher resolution cube and New Delhi’s shape file.

# trim the file object
trimmedDS_shape = raster_trim(alignedDS_max, method="shape", shapeFile="demoData/DelhiShape/Districts.shp")

# get dimensions
trimmedDS_shape.get_raster_dimensions()
## [(495, 530), (495, 530)]

Let’s plot the result if the shape file cropping method.

# get data
trimmedArray = trimmedDS_shape.get_raster_data()

# plot file 1
plt.imshow(trimmedArray[0], vmin=0, vmax=17)
## <matplotlib.image.AxesImage object at 0x7fcd03fff2b0>
plt.show()

# plot file 2

plt.imshow(trimmedArray[1], vmin=0, vmax=100)
## <matplotlib.image.AxesImage object at 0x7fce27d776a0>
plt.show()

Make a cube

Now that we have cleaned data sets, we can assemble them into a cube as shown in the previous vignette. Lets start be creating a time object. cube_time() allows you to create a time vector that skips user-specified chunks of time using the skips arguemnt.

# create time vec
yearObj = cube_time(start="1995", length=2, scale = "year", skips = 21)

We will now create a cube by assigning files and bands to time.

cube = make_cube(data = trimmedDS_shape, fileName = "indiaCube.nc4", organizeFiles = "filestotime", organizeBands="bandstotime", timeObj = yearObj)

Finally, lets look at the data

cube.get_raster_data()
<xarray.DataArray (lat: 530, lon: 495, time: 2)>
array([[[127., 127.],
        [127., 127.],
        [127., 127.],
        ...,
        [127., 127.],
        [127., 127.],
        [127., 127.]],

       [[127., 127.],
        [127., 127.],
        [127., 127.],
        ...,
        [127., 127.],
        [127., 127.],
        [127., 127.]],

       [[127., 127.],
        [127., 127.],
        [127., 127.],
        ...,
...
        ...,
        [127., 127.],
        [127., 127.],
        [127., 127.]],

       [[127., 127.],
        [127., 127.],
        [127., 127.],
        ...,
        [127., 127.],
        [127., 127.],
        [127., 127.]],

       [[127., 127.],
        [127., 127.],
        [127., 127.],
        ...,
        [127., 127.],
        [127., 127.],
        [127., 127.]]], dtype=float32)
Coordinates:
  * lon      (lon) float32 9.33e+04 9.34e+04 9.35e+04 ... 1.426e+05 1.427e+05
  * lat      (lat) float32 3.201e+06 3.201e+06 3.201e+06 ... 3.148e+06 3.148e+06
  * time     (time) datetime64[ns] 1995-12-31 2016-12-31